-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Dashboard] Remove gpustats dependencies from Ray[default] #41044
Conversation
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
16adb6d
to
13fb433
Compare
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lg. Could you update the PR description?
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
@ericl needs your approval here as code owner. Context: we are removing GPUtil and gpustats dependencies with a vendored in pyvnml single file library (the offical nvidia library that other libraries use internally) so that it works with minimal ray and it also removes unnecessary transitive dependencies included by gpustats for terminal display. |
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
memory_used=int(pynvml.nvmlDeviceGetMemoryInfo(gpu_handle).used) | ||
// MB, | ||
memory_total=int(pynvml.nvmlDeviceGetMemoryInfo(gpu_handle).total) | ||
// MB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could merge these two nvmlDeviceGetMemoryInfo
calls into one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, in NVIDIA driver 510.39.01, a v2 memory info API was added:
https://github.com/NVIDIA/nvidia-settings/blob/510.39.01/src/nvml.h#L218-L241
The unversioned API (v1) and v2 API will return different results on R510+ drivers.
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
c2b4a50
to
1e0f5ad
Compare
# nvdia-ml-py version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment saying why we pick this version: something like we are using this version because it uses v2 api and supports a wider range of drivers.
Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always great to see deps removed
…ct#41044) Add method to get gpu utilization similarly on how gpustats did, and remove gpustats from ray[default] dependencies. Signed-off-by: Jonathan Nitisastro <jonathancn@anyscale.com>
…ay-project#41044)" (ray-project#41375) Reverts ray-project#41044 premerge is busted and potentially blocking people from merging into branch cut. Revert to unblock Failing test: linux://python/ray/tests:test_streaming_generator_2 This reverts commit 9b9fb55.
Why are these changes needed?
Add method to get gpu utilization similarly on how gpustats did, and remove gpustats from ray[default] dependencies.
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.